-- The MAILING DATE of this communication appears on the cover sheet with the correspondence address --
Period for Reply 



A SHORTENED STATUTORY PERIOD FOR REPLY IS SET TO EXPIRE 3 MONTH(S) FROM 
THE MAILING DATE OF THIS COMMUNICATION. 

- Extensions of time may be available under the provisions of 37 CFR 1.136(a). In no event, however, may a reply be timely filed
after SIX (6) MONTHS from the mailing date of this communication.

- If the period for reply specified above is less than thirty (30) days, a reply within the statutory minimum of thirty (30) days will be considered timely.

- If NO period for reply is specified above, the maximum statutory period will apply and will expire SIX (6) MONTHS from the mailing date of this communication.

- Failure to reply within the set or extended period for reply will, by statute, cause the application to become ABANDONED (35 U.S.C. § 133). 
Any reply received by the Office later than three months after the mailing date of this communication, even if timely filed, may reduce any 
earned patent term adjustment. See 37 CFR 1.704(b). 
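
For illustration only, the period arithmetic stated above can be sketched in Python. The function names add_months and reply_window are hypothetical and are not part of any Office system. The sketch assumes that a month is added by keeping the day of the month and clamping to the last day of a shorter month, and it ignores the rule that a period ending on a Saturday, Sunday, or Federal holiday carries over to the next business day.

```python
import calendar
from datetime import date
from typing import Tuple


def add_months(d: date, months: int) -> date:
    """Return the date `months` later, clamping the day to the month's end."""
    total = d.month - 1 + months
    year = d.year + total // 12
    month = total % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def reply_window(mailing_date: date,
                 set_months: int = 3,
                 max_months: int = 6) -> Tuple[date, date]:
    """Return (end of shortened period, end of maximum statutory period)."""
    return add_months(mailing_date, set_months), add_months(mailing_date, max_months)
```

With a hypothetical mailing date of date(2004, 4, 9), reply_window returns 2004-07-09 as the end of the shortened statutory period and 2004-10-09 as the six-month limit. The actual mailing date appears on the cover sheet.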

Status 

1) [X] Responsive to communication(s) filed on 05/02/2001.
2a) [ ] This action is FINAL. 2b) [X] This action is non-final.

3) [ ] Since this application is in condition for allowance except for formal matters, prosecution as to the merits is
closed in accordance with the practice under Ex parte Quayle, 1935 C.D. 11, 453 O.G. 213.

Disposition of Claims 

4) [X] Claim(s) 1-27 is/are pending in the application.

4a) Of the above claim(s) is/are withdrawn from consideration.

5) [ ] Claim(s) is/are allowed.

6) [X] Claim(s) 1-27 is/are rejected.

7) [ ] Claim(s) is/are objected to.

8) [ ] Claim(s) are subject to restriction and/or election requirement.

Application Papers 

9) [X] The specification is objected to by the Examiner.

10) [X] The drawing(s) filed on 05/02/2001 is/are: a) [ ] accepted or b) [X] objected to by the Examiner.

Applicant may not request that any objection to the drawing(s) be held in abeyance. See 37 CFR 1.85(a).
Replacement drawing sheet(s) including the correction is required if the drawing(s) is objected to. See 37 CFR 1.121(d).

11) [ ] The oath or declaration is objected to by the Examiner. Note the attached Office Action or form PTO-152.

Priority under 35 U.S.C. § 119 

12) [ ] Acknowledgment is made of a claim for foreign priority under 35 U.S.C. § 119(a)-(d) or (f).
a) [ ] All b) [ ] Some * c) [ ] None of:

1. [ ] Certified copies of the priority documents have been received.

2. [ ] Certified copies of the priority documents have been received in Application No. .

3. [ ] Copies of the certified copies of the priority documents have been received in this National Stage
application from the International Bureau (PCT Rule 17.2(a)).
* See the attached detailed Office action for a list of the certified copies not received. 
Detailed Action
Drawings

1. The drawings are objected to as failing to comply with 37 CFR 1.84(p)(5) because they
include the following reference sign(s) not mentioned in the description: Fig. 2, Element 227. 

A proposed drawing correction, corrected drawings, or amendment to the specification to 
add the reference sign(s) in the description, are required in reply to the Office action to avoid 
abandonment of the application. The objection to the drawings will not be held in abeyance. 

Specification 

2. The disclosure is objected to because of the following informalities: the term "speech 
decoder" should be replaced with --speech recognizer-- or --speech recognition engine-- since, as
per the specification, no decoding of an encoded speech signal is being performed. The process 
of detecting a pause between words to further recognize individual terms in an utterance is 
identified, as is well known in the art, as speech recognition and should be noted appropriately. 
For example, Page 19, Paragraph [0064], Lines 5-9, or Page 19, Paragraph [0067], Lines 1-2. 

Appropriate correction is required. 
Claim Objections

3. Claim 16 is objected to because of the following informalities: "Claim 18" should be
corrected to read --Claim 15--.

Appropriate correction is required. 

4. Claims 13, 15, 16, and 27 are objected to because of the following informalities: the
term "decoder" should be replaced with --speech recognizer-- or --speech recognition engine--
since, as per the specification, no decoding of an encoded speech signal is being performed. The
process of detecting a pause between words to further recognize individual terms in an utterance
is identified, as is well known in the art, as speech recognition and should be noted appropriately.

The examiner has interpreted "decoder" to mean --speech recognizer-- for the application
of prior art.

Appropriate correction is required. 



Claim Rejections - 35 USC §103 



5. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in 
section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are 
such that the subject matter as a whole would have been obvious at the time the invention was made to a person 
having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the 
manner in which the invention was made. 

6. Claims 1-8, 10-23, and 25-27 are rejected under 35 U.S.C. 103(a) as being unpatentable
over Zavoli et al (U.S. Patent: 6,598,016) in view of Power et al (U.S. Patent: 5,848,388).

With respect to Claims 1 and 13, Zavoli discloses:

A method and system for recognizing speech in systems that accept speech input, 
comprising: 

Receiving at least a current subgroup of speech units that form part of a complete speech 
sequence that is to be input from a user (receiving part of a complete spoken digit string from a 
microphone, Col. 6, Lines 52-57);

Recognizing the speech units of the subgroup to provide a recognition result (displaying
individual digits within a string to a user for correction/verification, Col. 6, Lines 54-57, upon
recognition from a voice recognition module, Col. 5, Lines 24-28); and

Immediately feeding back the recognition result for verification by the user (displaying 
individual digits to a user, upon recognition, for correction/verification, Col. 6, Lines 54-57). 

Zavoli does not teach the ability to detect pauses in speech; however, Power discloses:

Detecting a natural pause between input subgroups (pause detector used to detect a pause 
following a word and further enable a parser to output a word recognition signal, Col. 4, Lines 
64-66). 

Zavoli and Power are analogous art because they are from a similar field of endeavor in 
speech-controlled interfaces capable of recognizing segments of a complete utterance. Thus, it 
would have been obvious to a person of ordinary skill in the art, at the time of invention, to 
combine the ability to detect pauses between words as taught by Power with the method and 
system utilizing recognition and display of utterance segments for correction as taught by Zavoli 
to increase speech recognition accuracy by clearly identifying pauses between the spoken digits 
to recognize individual speech segments of a complete utterance, prevent recognition error in 
confusing inter-word pauses with an end of speech, and further correct any words within a
sequence that may have been spoken in error or recognized incorrectly (recognition error, 
Power, Col. 2, Lines 48-64). Therefore, it would have been obvious to combine Power with 
Zavoli for the benefit of obtaining a more accurate speech recognition system capable of 
recognizing segments of a complete utterance using inter-word pause detection means, to obtain 
the invention as specified in Claims 1 and 13. 
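
For illustration only, the combination discussed above (pause detection used to delimit subgroups, followed by recognition and immediate feedback of each subgroup) can be sketched in Python. This is a minimal sketch under stated assumptions and is not the implementation of Zavoli, Power, or the applicant: the input is simulated as (unit, silence_after) pairs rather than audio, the 0.5 second threshold is arbitrary, and the names split_on_pauses, enter_sequence, recognize, and confirm are hypothetical.

```python
from typing import Callable, Iterable, List, Tuple


def split_on_pauses(units: Iterable[Tuple[str, float]],
                    pause_threshold: float = 0.5) -> List[List[str]]:
    """Close a subgroup whenever the silence after a unit reaches the threshold."""
    subgroups: List[List[str]] = []
    current: List[str] = []
    for unit, silence_after in units:
        current.append(unit)
        if silence_after >= pause_threshold:
            subgroups.append(current)
            current = []
    if current:  # trailing units with no closing pause form the last subgroup
        subgroups.append(current)
    return subgroups


def enter_sequence(units: Iterable[Tuple[str, float]],
                   recognize: Callable[[List[str]], List[str]],
                   confirm: Callable[[List[str]], bool],
                   pause_threshold: float = 0.5) -> List[str]:
    """Recognize each subgroup and feed the result back before moving on."""
    accepted: List[str] = []
    for subgroup in split_on_pauses(units, pause_threshold):
        result = recognize(subgroup)   # hypothetical recognizer callback
        if confirm(result):            # hypothetical feedback/verification step
            accepted.extend(result)
        # A rejected subgroup is simply dropped here; a real system
        # would prompt the user to repeat it.
    return accepted
```

Given the simulated input [("4", 0.1), ("1", 0.1), ("5", 0.8), ("5", 0.1), ("5", 0.9), ("1", 0.0)], split_on_pauses yields the subgroups ["4", "1", "5"], ["5", "5"], and ["1"], each of which would be fed back for verification in turn.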

With respect to Claims 2 and 14, Zavoli further discloses: 

A speech recognition method and system, wherein said user is only prompted to repeat 
said subgroup for re-recognition and re-verification if a rejection criteria is met ("no" command 
used to delete a previous incorrect digit so that a new digit within a sequence may be reentered. 
A correct sequence is indicated upon the utterance of a "yes" command, Col. 6, Lines 58-63). 

Also, it would have been obvious to one of ordinary skill in the art, at the time of 
invention, to further indicate that a digit has been deleted by prompting a user to enter a new 
digit, thus making the user aware that the previous digit has been deleted and may be replaced. 

With respect to Claims 3 and 20, Zavoli further recites: 

A speech recognition method and system, further comprising: 

Repeating the steps of Claim 1 for remaining input subgroups until it is determined that 
the complete speech sequence has been recognized (recognition of digits within a string until a 
"yes" command is received, indicating sequence completion, Col. 6, Lines 60-65).

With respect to Claims 4 and 21, Zavoli teaches the system utilizing recognition and 
display of utterance segments for correction, as applied to Claim 1. Zavoli does not teach the
ability to output a feedback utterance to a user through speech synthesis means or a pre-recorded
message; however, Power discloses:

A speech recognition method and system, wherein the last step of Claim 1 is effected
using pre-recorded prompts or via text-to-speech synthesis (TTS) to feed back the recognition
result (providing a synthesized prompt and response to a user for verification of a speech input,
Col. 9, Lines 57-61).

Zavoli and Power are analogous art because they are from a similar field of endeavor in 
speech-controlled interfaces capable of recognizing segments of a complete utterance. Thus, it 
would have been obvious to a person of ordinary skill in the art, at the time of invention, to 
combine the speech synthesis means for prompting a user to verify a speech input as taught by 
Power with the method and system enabling recognition and display of utterance segments for 
correction as taught by Zavoli in order to synthesize recognized speech segments, thus providing 
a convenient means of notifying a user of a recognition result if a user is occupied and does not 
have the means to view a text display such as in an automobile application. Also, it would have
been obvious to one of ordinary skill in the art, at the time of invention, to use a pre-recorded 
message in place of speech synthesis, since a pre-recorded message would be an obvious 
variation of synthesis as a means of prompting a recognition result to a user. Therefore, it would 
have been obvious to combine Power with Zavoli for the benefit of obtaining a convenient 
means of notifying a user of a recognition result, to obtain the invention as specified in Claims 4 
and 21. 
With respect to Claims 5 and 18, Zavoli adds:

A speech recognition method and system, wherein said rejection criteria is embodied as a
negative utterance spoken by the user after receiving the fed back recognition result ("no"
command used to delete a previously displayed digit, Col. 6, Lines 58-60).

With respect to Claim 6, Zavoli further discloses: 

A speech recognition method, wherein said rejection criteria is embodied as a negative 
utterance spoken by the user concurrent with inputting the subgroup that is recognized in the 
third step of Claim 1 ("no" command used to delete a previously displayed digit before a digit
sequence is completely entered and verified with a "yes" command, Col. 6, Lines 58-65).

With respect to Claims 7 and 22, Zavoli in view of Power teaches the speech recognition 
method and system capable of recognizing individual word segments (digits) through pause 
detection means to enable, upon input of a negative utterance, correction of input and recognition 
errors, as applied to Claims 2 and 14. Neither Zavoli nor Power specifically suggests prompting a
user to input shorter speech segments upon repeated recognition errors; however, it would have
been obvious to one of ordinary skill in the art, at the time of invention, to prompt the user to 
input shorter speech segments because if repeated recognition errors occur, shorter utterances 
have less complex speech models and thus, logically, would provide a higher level of recognition 
accuracy. Therefore, prompting a user to input easily recognized, shorter speech segments 
would provide a well known means of increasing recognition accuracy. 

With respect to Claims 8 and 23, Zavoli in view of Power teaches the speech recognition 
method and system capable of recognizing individual word segments (digits) through pause 
detection means to enable, upon input of a negative utterance, correction of input and recognition 
errors, as applied to Claims 2 and 14. Neither Zavoli nor Power specifically suggests prompting a
user to input shorter speech segments upon repeated recognition errors as a means of training a
user; however, it would have been obvious to one of ordinary skill in the art, at the time of
invention, that by speaking shorter and more easily recognized speech segments, a user would 
gradually learn the proper way to input recognizable utterances. For instance, a user may speak a 
string of digits too quickly to be recognized correctly. By speaking each speech segment 
individually, the speaker would be able to attempt a single utterance segment multiple times and 
gradually come to understand the proper method of producing recognizable speech. Therefore, 
prompting a user to speak smaller speech segments acts as a means of training that user to properly
input an utterance. 

With respect to Claims 10 and 25, Zavoli further discloses: 

A speech recognition method and system, wherein said speech units are selected from 
any of spoken digits, spoken letters and spoken words (spoken digit recognition, Col. 6, Lines 
54-57). 

Also, it would have been obvious to one of ordinary skill in the art, at the time of 
invention, to implement the speech recognition method taught by Zavoli in a word and letter 
recognition application, since all types are related to recognition of a speech segment within an 
utterance, to increase the usefulness of the recognizer. Furthermore, word and letter recognition 
are obvious variations of spoken digit recognition within a sequence of digits and thus, would be 
compatible with the same system, only requiring additional speech model sets. 
With respect to Claims 11 and 26, Zavoli further suggests:

A speech recognition method and system, wherein input of a next subgroup after 
receiving the fed back recognition result indicates a correct recognition of the currently input 
subgroup ("no" command used to delete a previous incorrect digit so that a new digit within a 
sequence may be reentered. If a digit is correctly recognized the user will input another digit, 
thus the previous recognition result is considered correct since no negative command was input. 
The user may further verify the result once an entire sequence has been entered with a "yes" 
command, Col. 6, Lines 58-65). 

With respect to Claims 12 and 27, Zavoli teaches the method and system utilizing 
recognition and display of utterance segments for correction initialized through a negative input 
command, as applied to Claims 2 and 14. Zavoli does not teach the rejection of a speech
segment based on a confidence level; however, Power discloses:

A speech recognition method, wherein said rejection criteria requires determining a level 
of confidence in said recognition result (rejection of a recognized speech input based on 
confidence level, Col. 9, Lines 54-57). 

Zavoli and Power are analogous art because they are from a similar field of endeavor in 
speech-controlled interfaces capable of recognizing segments of a complete utterance. Thus, it 
would have been obvious to a person of ordinary skill in the art, at the time of invention, to 
combine the method of recognition rejection based upon a confidence level as taught by Power 
with the method and system utilizing recognition and display of utterance segments for 
correction initialized through a negative input command as taught by Zavoli to provide a means 
of rejecting an invalid or mispronounced speech input that has a suspect identification, through 
the use of a confidence level score, to ensure that an entered digit sequence is properly
recognized, especially in an application where digit sequence accuracy is critical such as password
entry. Therefore, it would have been obvious to combine Power with Zavoli for the benefit of 
obtaining a means of improving recognition accuracy through the use of confidence level scores, 
to obtain the invention as specified in Claims 12 and 27. 
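
For illustration only, the two rejection criteria discussed in this action (a negative utterance spoken by the user, and a confidence level in the recognition result) can be sketched together in Python. This is a minimal sketch under stated assumptions, not the method of Zavoli or Power: the RecognitionResult type, the 0-to-1 confidence scale, the 0.6 threshold, and the NEGATIVE_UTTERANCES set are all hypothetical.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical vocabulary of negative utterances.
NEGATIVE_UTTERANCES = {"no"}


@dataclass
class RecognitionResult:
    units: List[str]    # recognized speech units, e.g. digits
    confidence: float   # hypothetical score between 0.0 and 1.0


def rejection_criteria_met(result: RecognitionResult,
                           min_confidence: float = 0.6) -> bool:
    """Reject on a negative utterance or on a low confidence score."""
    if any(unit.lower() in NEGATIVE_UTTERANCES for unit in result.units):
        return True
    return result.confidence < min_confidence
```

Under these assumptions, a result of ["4", "1", "5"] with confidence 0.9 is accepted, the same digits with confidence 0.3 are rejected, and any result containing "no" is rejected regardless of its score.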

With respect to Claim 15, Zavoli further discloses: 

A speech recognition system, wherein the decoder (speech recognizer) compares the 
input subgroup with stored recognition grammar in order to determine the recognition result 
(speech recognition module featuring a phonetic dictionary, Col. 5, Lines 24-28).

With respect to Claim 16, Zavoli additionally suggests: 

A speech recognition system, wherein the recognition grammar is stored in a remote 
memory accessible by the decoder (speech recognition module) (invention process implemented 
on a server accessed over telephone lines, Col. 3, Lines 15-18).

Thus, it would have been obvious to one of ordinary skill in the art, at the time of 
invention, to implement a speech recognition method, utilizing a phonetic dictionary, at a server 
in order to conserve system memory in a device with limited storage. 

With respect to Claim 17, Zavoli further recites: 

A speech recognition system, wherein the recognition result includes at least one of a 
subgroup of speech units and a negative utterance representation that is included in the 
recognition result, and wherein the rejection criteria is met if the negative utterance is included 
therein (displaying individual digits to a user, upon recognition, for correction/verification, Col.
6, Lines 54-57, and a rejection result representation displayed to a user, along with previously
recognized digits, in the form of a deleted digit, Col. 6, Lines 54-60).

Claim 19 contains subject matter similar to Claims 6 and 17, and thus, is rejected for the 
same reasons. 

7. Claims 9 and 24 are rejected under 35 U.S.C. 103(a) as being unpatentable over Zavoli 
et al, in view of Power et al, and in further view of Larsen ("Investigating a Mixed-Initiative 
Dialogue Management Strategy, " 1997). 

With respect to Claims 9 and 24, Zavoli in view of Power teaches the speech recognition 
system capable of recognizing individual word segments (digits) through pause detection means 
to enable further correction of input and recognition errors, as applied to Claim 1. Neither Zavoli
nor Power teaches the ability to enter speech units using a dial pad upon repeated recognition
errors; however, Larsen discloses:

A speech recognition method and system, wherein if said rejection criteria are met 
repeatedly, the user is prompted to use a dial pad to enter the speech units (ability to switch to 
DTMF input mode upon repeated recognition errors, Pages 66-67, Application).

Zavoli, Power, and Larsen are analogous art because they are from a similar field of 
endeavor in speech-controlled interfaces. Thus, it would have been obvious to a person of 
ordinary skill in the art, at the time of invention, to combine the ability to enter speech units in a 
DTMF input mode upon repeated recognition errors as taught by Larsen with the speech 
recognition method and system capable of recognizing individual word segments (digits) through 
pause detection means to enable further correction of input and recognition errors as taught by 
Zavoli in view of Power to offer an alternative means of inputting information in a speech
interface if a user becomes frustrated with repeated recognition errors. Therefore, it would have 
been obvious to combine Larsen with Zavoli in view of Power for the benefit of offering a user 
an alternative method of data entry in a speech interface upon repeated recognition errors, to 
obtain the invention as specified in Claims 9 and 24. 
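
For illustration only, the fallback to dial pad (DTMF) entry upon repeated recognition errors can be sketched in Python. This is a minimal sketch under stated assumptions and does not reproduce Larsen's dialogue strategy: the function name next_prompt, the returned labels, and the limit of two speech attempts are hypothetical.

```python
def next_prompt(consecutive_rejections: int, max_speech_attempts: int = 2) -> str:
    """Choose the next prompt from the count of consecutive rejections."""
    if consecutive_rejections == 0:
        return "continue"          # last subgroup accepted, take the next one
    if consecutive_rejections < max_speech_attempts:
        return "repeat_subgroup"   # ask the user to speak the subgroup again
    return "use_dial_pad"          # repeated errors, switch to DTMF entry
```

With the default limit, one rejection produces a request to repeat the subgroup and a second consecutive rejection produces the dial pad prompt.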

Conclusion 

8. The prior art made of record and not relied upon is considered pertinent to applicant's 
disclosure: 

• Mostow et al (U.S. Patent: 5,920,838)- teaches a reading tutor that responds to a 
user with a prompt when a predetermined silence period is detected. 

• Gupta et al (U.S. Patent: 5,995,926)- discloses spoken digit string recognition 
utilizing an end point detector. 

• Young et al (U.S. Patent: 6,064,959)- teaches error correction means in a speech
recognition system featuring a pause detector to recognize individual words and 
partial recognition capability for error correction of continuously inputted speech. 

• Modi et al (U.S. Patent: 6,125,345)- discloses a method, compatible with digit 
string recognition, for rejecting a speech input based on a confidence score. 

• Pawlewski et al (U.S. Patent: 6,389,392)- teaches that partitioning speech into
shorter segments leads to higher recognition rates. 
• Imai et al (U.S. Patent: 6,393,398)- discloses a speech recognition apparatus that
provides a partial recognition of an utterance for the purpose of successively 
determining a recognition result while inputting speech. 

• Croft (U.S. Patent: 6,493,670)- teaches a speech interface that is capable of 
recognizing a sequence of digits and inter-word silence periods, and further 
features a prompt providing recognition status and a means for playing back a 
recognized digit for user confirmation.

9. Any inquiry concerning this communication or earlier communications from the
examiner should be directed to James S. Wozniak whose telephone number is (703) 305-8669
and email is James.Wozniak@uspto.gov. The examiner can normally be reached on
Mondays-Fridays, 8:30-4:30.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Talivaldis Ivars Smits, can be reached at (703) 306-3011. The fax/phone number for
the Technology Center 2600 where this application is assigned is (703) 872-9306.

Any inquiry of a general nature or relating to the status of this application or proceeding
should be directed to the technology center receptionist whose telephone number is
(703) 306-0377.



James S. Wozniak 
4/2/2004 




PRIMARY EXAMINER 



